GEM: Glare or Gloom, I Can Still See You – End-to-End Multi-Modal Object Detection

Abstract

Deep neural networks designed for vision tasks are often prone to failure when they encounter environmental conditions not covered by the training data. Single-modal strategies are insufficient when the sensor fails to acquire information due to malfunction or its design limitations. Multi-sensor configurations are known to provide redundancy, increase reliability, and are crucial in achieving robustness against asymmetric sensor failures. To address the issue of changing lighting conditions and asymmetric sensor degradation in object detection, we develop a multi-modal 2D object detector, and propose deterministic and stochastic sensor-aware feature fusion strategies. The proposed fusion mechanisms are driven by the estimated sensor measurement reliability values/weights. Reliable object detection in harsh lighting conditions is essential for applications such as self-driving vehicles and human-robot interaction. We also propose a new “r-blended” hybrid depth modality for RGB-D sensors. Through extensive experimentation, we show that the proposed strategies outperform the existing state-of-the-art methods on the FLIR-Thermal dataset, and obtain promising results on the SUNRGB-D dataset. We additionally record a new RGB-Infra indoor dataset, namely L515-Indoors, and demonstrate that the proposed object detection methodologies are highly effective for a variety of lighting conditions.
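The abstract does not spell out how the reliability weights enter the fusion step, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes each sensor yields a feature vector of the same length and a non-negative reliability weight; the deterministic variant takes a weighted average of the features, and the stochastic variant keeps one sensor's features with probability proportional to its weight. All function names, variable names, and values are hypothetical.

```python
import random
from typing import List, Sequence


def normalize(weights: Sequence[float]) -> List[float]:
    """Scale non-negative reliability weights so that they sum to 1."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one reliability weight must be positive")
    return [w / total for w in weights]


def fuse_deterministic(features: Sequence[Sequence[float]],
                       weights: Sequence[float]) -> List[float]:
    """Assumed deterministic fusion: weighted average of per-sensor features."""
    probs = normalize(weights)
    dim = len(features[0])
    return [sum(p * f[i] for p, f in zip(probs, features)) for i in range(dim)]


def fuse_stochastic(features: Sequence[Sequence[float]],
                    weights: Sequence[float],
                    rng: random.Random) -> List[float]:
    """Assumed stochastic fusion: keep one sensor's features, chosen with
    probability proportional to its reliability weight."""
    probs = normalize(weights)
    index = rng.choices(range(len(features)), weights=probs, k=1)[0]
    return list(features[index])


# Hypothetical feature vectors and reliability weights for two sensors.
rgb_features = [1.0, 2.0, 3.0]
thermal_features = [3.0, 2.0, 1.0]
reliability = [0.25, 0.75]  # e.g. the RGB stream is degraded by glare

fused = fuse_deterministic([rgb_features, thermal_features], reliability)
sampled = fuse_stochastic([rgb_features, thermal_features], reliability,
                          random.Random(0))
```

In this sketch a sensor whose weight drops to zero contributes nothing to the fused features, which is the behaviour one would want under an asymmetric sensor failure.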

Similar resources

Multi-Modal Data Augmentation for End-to-end ASR

We present a new end-to-end architecture for automatic speech recognition (ASR) that can be trained using symbolic input in addition to the traditional acoustic input. This architecture utilizes two separate encoders: one for acoustic input and another for symbolic input, both sharing the attention and decoder parameters. We call this architecture a multi-modal data augmentation network (MMDA),...

DenseBox: Unifying Landmark Localization with End to End Object Detection

How can a single fully convolutional neural network (FCN) perform on object detection? We introduce DenseBox, a unified end-to-end FCN framework that directly predicts bounding boxes and object class confidences through all locations and scales of an image. Our contribution is two-fold. First, we show that a single FCN, if designed and optimized carefully, can detect multiple different objects ...

Knowing Afghanistan: Can there be an end to the saga?

War on terrorism, as the motto which formed the cornerstone of global policies of former neo-conservative administration of the United States, is increasingly becoming ineffective in Afghanistan with the dreaded consequence of spilling over into Pakistan. This inevitable consequence of War on Terrorism in Afghanistan has brought the West face to face with the ‘nest of terrorism’ that CIA...

Talk Louder So I Can See You

How does one sense influence the processing of another? In this issue of Neuron, Ibrahim et al. (2016) demonstrate that presence of sound sharpens neural tuning in the primary visual cortex via activation of direct inputs from the primary auditory cortex.

Can You See Where I Point at?

Pointing on public displays is usually done in a relative and indirect fashion. However, such techniques have two drawbacks: first, a personal pointer is shown on the public screen which decreases the users’ privacy in selection tasks. And second, when multiple displays are present, users need to connect to the one they want to interact with a priori. In this paper we review the strengths of To...

Journal

Journal title: IEEE Robotics and Automation Letters

Year: 2021

ISSN: 2377-3766

DOI: https://doi.org/10.1109/lra.2021.3093871